Design of VMS Volume Shadowing Phase II - Host-based Shadowing

Author

  • Scott H. Davis
Abstract

VMS Volume Shadowing Phase II is a fully distributed, clusterwide data availability product designed to replace the obsolete controller-based shadowing implementation. Phase II is intended to service current and future generations of storage architectures. In these architectures, there is no intelligent, multiunit controller that functions as a centralized gateway to the multiple drives in the shadow set. The new software makes many additional topologies suitable for shadowing, including DSSI drives, DSA drives, and shadowing across VMS MSCP servers. This last configuration allows shadow set members to be separated by any supported cluster interconnect, including FDDI. All essential shadowing functions are performed within the VMS operating system. New MSCP controllers and drives can optionally implement a set of shadowing performance assists, which Digital intends to support in a future release of the shadowing product.

Digital Technical Journal Vol. 3 No. 3 Summer 1991

Overview

Volume shadowing is a technique that provides data availability to computer systems by protecting against data loss from media deterioration, communication path failures, and controller or device failures. The process of volume shadowing entails maintaining multiple copies of the same data on two or more physical volumes. Up to three physical devices are bound together by the volume shadowing software and present a virtual device to the system. This device is referred to as a shadow set or a virtual unit. The volume shadowing software replicates data across the physical devices. All shadowing mechanisms are hidden from the users of the system, i.e., applications access the virtual unit as if it were a standard, physical disk. Figure 1 shows a VMS Volume Shadowing Phase II set for a Digital Storage Systems Interconnect (DSSI) configuration of two VAX host computers.

Product Goals

The VMS host-based shadowing project was undertaken because the original controller shadowing product is architecturally incompatible with many prospective storage devices and their connectivity requirements. Controller shadowing requires an intelligent, common controller to access all physical devices in a shadow set. Devices such as the RF-series integrated storage elements (ISEs) with DSSI adapters and the RZ-series small computer systems interface (SCSI) disks present configurations that conflict with this method of access.

To support the range of configurations required by our customers, the new product had to be capable of shadowing physical devices located anywhere within a VAXcluster system and of doing so in a controller-independent fashion. The VAXcluster I/O system provides parallel access to storage devices from all nodes in a cluster simultaneously. In order to meet its performance goals, our shadowing product had to preserve this semantic also. Figure 2 shows clusterwide shadow sets for a hierarchical storage controller (HSC) configuration with multiple computer interconnect (CI) buses. When compared to Figure 1, this figure shows a larger cluster containing several clusterwide shadow sets. Note that multiple nodes in the cluster have direct, writable access to the disks comprising the shadow sets.

In addition to providing highly available access to shadow sets from anywhere in a cluster, the new shadowing implementation had other requirements. Phase II had to deliver performance comparable to that of controller-based shadowing, maximize application I/O availability, and ensure data integrity for critical applications.

In designing the new product, we benefited from customer feedback about the existing implementation. This feedback had a positive impact on the design of the host-based shadowing implementation. Our goals to maximize application I/O availability during transient states, to provide customizable, event-driven design and fail-over, to enable all cluster nodes to manage the shadow sets, and to enhance system disk capabilities were all affected by customer feedback.

Technical Challenges

To provide volume shadowing in a VAXcluster environment running under the VMS operating system required that we solve complex, distributed systems problems.[1] This section describes the most significant technical challenges we encountered and the solutions we arrived at during the design and development of the product.

Membership Consistency. To ensure the level of integrity required for high availability systems, the shadowing design must guarantee that a shadow set has the same membership and states on all nodes in the cluster. A simple way to guarantee this property would have been a strict client-server implementation, where one VAX computer serves the shadow set to the remainder of the cluster. This approach, however, would have violated several design goals; the intermediate hop required by data transfers would decrease system performance, and any failure of the serving CPU would require a lengthy fail-over and rebuild operation, thus negatively impacting system availability.

To solve the problem of membership consistency, we used the VMS distributed lock manager through a new executive thread-level interface.[2,3] We designed a set of event-driven protocols that shadowing uses to guarantee membership consistency. These protocols allowed us to make the shadow set virtual unit a local device on all nodes in the cluster. Membership and state information about the shadow set is stored on all physical members in an on-disk data structure called the storage control block (SCB). One way that shadowing uses this SCB information is to automatically determine the most up-to-date shadow set member(s) when the set is created. In addition to distributed synchronization primitives, the VMS lock manager provides a capability for managing a distributed state variable called a lock value block. Shadowing uses the lock value block to define a disk that is guaranteed to be a current member of the shadow set. Whenever a membership change is made, all nodes take part in a protocol of lock operations; the value block and the on-disk SCB are the final arbiters of set constituency.

Sequential Commands. A sequential I/O command, i.e., a Mass Storage Control Protocol (MSCP) concept, forces all commands in progress to complete before the sequential command begins execution. While a sequential command is pending, all new I/O requests are stalled until that sequential command completes execution. Shadowing requires the capability to execute a clusterwide, sequential command during certain operations. This capability, although a simple design goal for a client-server implementation, is a complex one for a distributed access model. We chose an event-driven, request/response protocol to create the sequential command capability. Since sequential commands have a negative impact on performance, we limited the use of these commands to performing membership changes, mount/dismount operations, and bad block and merge difference repairs. Steady state processing never requires using sequential commands.

Full Copy. A full copy is the means by which a new member of the shadow set is made current with the rest of the set. The challenge is to make copy operations unintrusive; application I/Os must proceed with minimal impact so that the level of service provided by the system is both acceptable and predictable. VMS file I/O provides record-level sharing through the application transparent locking provided by the VAX RMS software, Digital's record management services. Shadowing operates at the physical device level to handle a variety of low-level errors. Because shadowing has no knowledge of the higher-layer record locking, a copy operation must guarantee that the application I/Os and the copy operation itself generate the correct results and do so with minimal impact on application I/O performance.

Merge Operations. Merge operations are triggered when a CPU with write access to a shadow set fails. (Note that with controller shadowing, merge operations are copy operations that are triggered when an HSC fails.) Devices may still be valid members of the shadow set but may no longer be identical, due to outstanding writes in progress when the host CPU failed. The merge operation must detect and correct these differences, so that successive application reads for the same data produce consistent results. As for full copy operations, the challenge with merge processing is to generate consistent results with minimal impact on application I/O performance.

Booting and Crashing. System disk shadowing presents some special problems because the shadow set must be accessible to CPUs in the cluster when locking protocols and inter-CPU communication are disabled. In addition, crashing must ensure appropriate behavior for writing crash dumps through the primitive bootstrap driver, including how to propagate the dump to the shadow set. It was not practical to modify the bootstrap drivers because they are stored in read-only memory (ROM) on various CPU platforms that shadowing would support.
Error Processing. One major function of volume shadowing is to perform appropriate error processing for members of the shadow set, while maximizing data availability. To carry out this function, the software must prevent deadlocks between nodes and decide when to remove devices from the shadow set. We adopted a simple recovery ethic: a node that detects an error is responsible for fixing that error. Membership changes are serialized in the cluster, and a node only makes a membership change if the change is accompanied by improved access to the shadow set. A node never makes a change in membership without having access to some source members of the set.

Architecture

Phase II shadowing provides a local virtual unit on each node in the cluster with distributed control of that unit. Although the virtual unit is not served to the cluster, the underlying physical units that constitute a shadow set are served to the cluster using the standard VMS mechanisms. This scheme has many data availability advantages. The Phase II design

o Allows shadowing to use all the VMS controller fail-over mechanisms for physical devices. As a result, member fail-over approaches hardware speeds. Controller shadowing does not provide this capability.

o Allows each node in the cluster to perform error recovery based on access to physical data source members. The shadowing software treats communication failures between any cluster node and shadow set members as normal shadowing events with customer-definable recovery metrics.

Major Components

VMS Volume Shadowing Phase II consists of two major components: SHDRIVER and SHADOW_SERVER. SHDRIVER is the shadowing virtual unit driver. As a client of disk class drivers, SHDRIVER is responsible for handling all I/O operations that are directed to the virtual unit. SHDRIVER issues physical I/O operations to the disk class driver to satisfy the shadow set virtual unit I/O requests. SHDRIVER is also responsible for performing all distributed locking and for driving error recovery.

SHADOW_SERVER is a VMS ancillary control process (ACP) responsible for driving copy and merge operations performed on the local node. Only one optimal node is responsible for driving a copy or merge operation on a given shadow set, but when a failure occurs the operation will fail over and resume on another CPU. Several factors determine this optimal node including the types of access paths, and controllers for the members and user-settable, per-node copy quotas.

Primitives

This section describes the locking protocols and error recovery processing functions that are used by the shadowing software. These primitives provide basic synchronization and recovery mechanisms for shadow sets in a VAXcluster system.

Locking Protocols. The shadowing software uses event-driven locking protocols to coordinate clusterwide activity. These request/response protocols provide maximum application I/O performance. A VMS executive interface to the distributed lock manager allows shadowing to make efficient use of locking directly from SHDRIVER.

One example of this use of locking protocols in VMS Volume Shadowing Phase II is the sequential command protocol. As mentioned in the Technical Challenges section, shadowing requires the sequential command capability but minimizes the use of this primitive. Phase II implements the capability by using several locks, as described in the following series of events.

A node that needs to execute a sequential command first stalls I/O locally and flushes operations in progress. The node then performs lock operations that ensure serialization and sends sequential stall requests to other nodes that have the shadow set mounted. This initiating thread waits until all other nodes in the cluster have flushed their I/Os and responded to the node requesting the sequential operation. Once all nodes have responded or left the cluster, the operations that compose the sequential command execute. When this process is complete, the locks are released, allowing asynchronous threads on the other nodes to proceed and automatically resume I/O operations. The local node resumes I/O as well.

Error Recovery Processing. Error recovery processing is triggered by either asynchronous notification of a communication failure or a failing I/O operation directed towards a physical member of the shadow set. Two major functions of error recovery are built into the virtual unit driver: active and passive volume processing.

Active volume processing is triggered directly by events that occur on a local node in the cluster. This type of volume processing uses a simple, localized ethic for error recovery from communication or controller failures. Shadow set membership decisions are made locally, based on accessibility. If no members of a shadow set are currently accessible from a node, then the membership does not change. If some but not all members of the set are accessible, the local node, after attempting fail-over, removes some members to allow application I/O to proceed. The system manager sets the time period during which members may attempt fail-over. The actual removal operation is a sequential command. The design allows for maximum flexibility and quick error recovery and implicitly avoids deadlock scenarios.

Passive volume processing responds to events that occur elsewhere in the cluster; messages from nodes other than the local one trigger the processing by means of the shadowing distributed locking protocols. This volume processing function is responsible for verifying the shadow set membership and state on the local node and for modifying this membership to reflect any changes made to the set by the cluster. To accomplish these operations, the shadowing software first reads the lock value block to find a disk guaranteed to still be in the shadow set. Then the recovery process retrieves the physical member's on-disk SCB data and uses this information to perform the relevant data structure updates on the local node.

Application I/O requests to the virtual unit are always stalled during volume processing. In the case of active volume processing, the stalling is necessary because many I/Os would fail until the error was corrected. In passive volume processing, the I/O requests are stalled because the membership of the set is in doubt, and correct processing of the request cannot be performed until the situation is corrected.

Steady State Processing

The shadowing virtual unit driver receives application read and write requests and must direct the I/O appropriately. This section describes these steady state operations.

Read Algorithms

The shadowing virtual unit driver receives application read requests and directs a physical I/O to an appropriate member of the set. SHDRIVER attempts to direct the I/O to the optimum device based on locally available data. This decision is based on (1) the access path, i.e., local or served by the VMS operating system, (2) the service queue lengths at the candidate controller, and (3) a round-robin algorithm among equal paths. Figure 3 shows a shadow set read operation. An application read to the shadow set causes a single physical read to be sent to an optimal member of the set. In Figure 3, there is one local and one remote member, so the read is sent to the local member.

Data repair operations caused by media defects are triggered by a read operation failing with an appropriate error, such as forced error or parity. The shadowing driver attempts this repair using another member of the shadow set.
This repair operation is performed with the synchronization of a sequential command. Sequential protection is required because a read operation is being converted into a write operation without explicit, RMS-layer synchronization.
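
The three read-selection criteria listed under Read Algorithms can be sketched as a small chooser. This is an illustration only: the names Member and ReadSelector are hypothetical, and the strict ordering of the criteria (access path first, then queue length, then round-robin among ties) is an assumption, since the text names the criteria without saying how SHDRIVER weighs them.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical view of one shadow set member as seen from this node."""
    name: str
    local: bool          # True if the access path is local, False if served
    queue_length: int    # outstanding requests at the member's controller


class ReadSelector:
    """Picks a member for each read; the lexicographic ordering is assumed."""

    def __init__(self):
        self._turn = 0   # round-robin counter shared across calls

    def choose(self, members):
        if not members:
            raise ValueError("shadow set has no accessible members")

        def cost(m):
            # Prefer local paths, then shorter controller queues.
            return (0 if m.local else 1, m.queue_length)

        best = min(cost(m) for m in members)
        tied = [m for m in members if cost(m) == best]
        # Rotate among members whose paths are equally good.
        pick = tied[self._turn % len(tied)]
        self._turn += 1
        return pick
```

With one local and one remote member, as in the Figure 3 configuration, choose() returns the local member. With two equally loaded local members, successive calls alternate between them.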

Journal title:
  • Digital Technical Journal

Volume 3  Issue 

Pages  -

Publication date 1991